Introduction
Dictionary
Set
We’ve discussed three built-in sequence collections — strings
, lists
and tuples
.
Now, we consider the built-in non-sequence collections — dictionaries
and sets
.
A dictionary
is an unordered collection which stores key–value pairs that map immutable keys to values, just as a conventional dictionary maps words to definitions.
A set
is an unordered collection of unique immutable elements.
Like a list
, a dictionary
is a mutable collection of many values, but more general. In a list
, the index positions have to be integers; in a dictionary
, the indices can be any immutable data type.
You can think of a dictionary as a mapping between a set of indices (which are called keys) and a set of values. Each key maps to a value. The association of a key and a value is called a key-value pair or sometimes an item.
A dictionary
’s keys must be immutable (such as strings
, integers
or tuples
) and unique (that is, no duplicates). However, multiple keys can have the same value.
As an example, we’ll build a dictionary that maps from subjects to grades, so the keys are string
while the values are integers
. The function dict
creates a new dictionary with no items.
type({}) # {} also treated as dict in Python
dict
grade = dict()
type(grade), grade
(dict, {})
To add/update items to the dictionary, you can again use subscript operator (square brackets):
grade['calculus'] = 85 # Key:'calculus', value: 85
print(grade) # Note that key and value are separate by colon
{'calculus': 85}
You can create a dictionary
that contains multiple items by enclosing in curly braces, {}
, a comma-separated list of key–value pairs, each of the form key:value
.
grade = {'calculus':85, 'introduction to mathematics':80, 'computer programming':90, 'linear algebra':95}
grade
{'calculus': 85, 'introduction to mathematics': 80, 'computer programming': 90, 'linear algebra': 95}
Note that you can store them using separate lists
for subjects and scores, but the following update and maintenance will become tedious:
subjects = ['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra']
score = [85, 80, 90, 95]
You can now use the keys to look up the corresponding values:
print(grade['computer programming'])
90
You can obtain the number of items using len()
len(grade)
4
Trying to access a key that does not exist in a dictionary
will result in a KeyError
error message, much like a list’s
“out-of-range” IndexError
error message.
grade['English']
--------------------------------------------------------------------------- KeyError Traceback (most recent call last) ~\AppData\Local\Temp\ipykernel_35416\3219575328.py in <module> ----> 1 grade['English'] KeyError: 'English'
To add or delete an entry, it is similar to list
grade['English'] = 100
grade
{'calculus': 85, 'introduction to mathematics': 80, 'computer programming': 90, 'linear algebra': 95, 'English': 100}
You can delete a key–value pair from a dictionary with the del
statement:
del grade['English']
grade
{'calculus': 85, 'introduction to mathematics': 80, 'computer programming': 90, 'linear algebra': 95}
display_quiz(path+"dict1.json", max_width=800)
keys()
, values()
, and items()
Methods¶There are three dictionary
methods that will return list
-like values of the dictionary
’s keys, values, or both keys and values: keys()
, values()
, and items()
.
The values returned by these methods are not true lists, but these data types (dict_keys
, dict_values
, and dict_items
, respectively) can be used in for
loops (Just like range
object)!
If you want a true list from one of these methods, pass its list-like return value to the list()
function
subject = list(grade.keys())
score = list(grade.values())
print(subject)
print(score)
['calculus', 'introduction to mathematics', 'computer programming', 'linear algebra'] [85, 80, 90, 95]
for v in grade.values():
print(v)
85 80 90 95
Here, a for
loop iterates over each of the values in the grade
dictionary. A for
loop can also iterate over the keys:
for k in grade.keys():
print(k)
calculus introduction to mathematics computer programming linear algebra
for k in grade:
print(k)
calculus introduction to mathematics computer programming linear algebra
Note that by default, it will traverse over the keys!
Dictionaries have a method called items()
that returns a list of tuples, where each tuple is a key-value pair:
list(grade.items())
[('calculus', 85), ('introduction to mathematics', 80), ('computer programming', 90), ('linear algebra', 95)]
Combining items()
, multiple assignment, and for
, you can see a nice code pattern for traversing the keys and values of a dictionary in a single loop:
for key, val in grade.items():
print(key,val)
calculus 85 introduction to mathematics 80 computer programming 90 linear algebra 95
display_quiz(path+"dict2.json", max_width=800)
Recall from the previous chapter that the in
and not in
operators can check whether a value exists in a list. You can also use these operators to see whether a certain key or value exists in a dictionary
'calculus' in grade, 'English' in grade.keys(), 85 in grade.values()
(True, False, True)
get()
Method¶if 'English' in grade:
e_score= grade['English']
It’s tedious to check whether a key exists in a dictionary
before accessing that key’s value. Fortunately, dictionaries have a get()
method that takes two arguments: the key of the value to retrieve and a fallback value to return if that key does not exist.
picnicItems = {'apples': 5, 'cups': 2}
print('I am bringing ' + str(picnicItems.get('cups', 0)) + ' cups.')
print('I am bringing ' + str(picnicItems.get('eggs', 0)) + ' eggs.')
I am bringing 2 cups. I am bringing 0 eggs.
Because there is no 'eggs' key in the picnicItems
dictionary, the default value 0 is returned by the get()
method. Without using get()
, the code would have caused a KeyError
message
picnicItems = {'apples': 5, 'cups': 2}
'I am bringing ' + str(picnicItems['eggs']) + ' eggs.'
--------------------------------------------------------------------------- KeyError Traceback (most recent call last) ~\AppData\Local\Temp\ipykernel_35416\2486019301.py in <module> 1 picnicItems = {'apples': 5, 'cups': 2} ----> 2 'I am bringing ' + str(picnicItems['eggs']) + ' eggs.' KeyError: 'eggs'
display_quiz(path+"get.json", max_width=800)
setdefault()
Method¶You’ll often have to set a value in a dictionary
for a certain key only if that key does not already have a value. The code looks something like this:
spam = {'name': 'Pooka', 'age': 5}
if 'color' not in spam:
spam['color'] = 'black'
The setdefault()
method offers a way to do this in one line of code. The first argument passed to the method is the key to check for, and the second argument is the value to set at that key if the key does not exist.
The setdefault()
method is a nice shortcut to ensure that a key exists. Here is a short program that counts the number of occurrences of each letter in a string
.
message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}
for character in message:
if character not in count:
count[character] = 0
count[character] = count[character] + 1
print(count)
{'I': 1, 't': 6, ' ': 13, 'w': 2, 'a': 4, 's': 3, 'b': 1, 'r': 5, 'i': 6, 'g': 2, 'h': 3, 'c': 3, 'o': 2, 'l': 3, 'd': 3, 'y': 1, 'n': 4, 'A': 1, 'p': 1, ',': 1, 'e': 5, 'k': 2, '.': 1}
message = 'It was a bright cold day in April, and the clocks were striking thirteen.'
count = {}
for character in message:
count.setdefault(character, 0)
count[character] = count[character] + 1
print(count)
{'I': 1, 't': 6, ' ': 13, 'w': 2, 'a': 4, 's': 3, 'b': 1, 'r': 5, 'i': 6, 'g': 2, 'h': 3, 'c': 3, 'o': 2, 'l': 3, 'd': 3, 'y': 1, 'n': 4, 'A': 1, 'p': 1, ',': 1, 'e': 5, 'k': 2, '.': 1}
You can view the execution of this program at https://autbor.com/setdefault. The program loops over each character in the message
variable’s string, counting how often each character appears. The setdefault()
method ensures that the key is in the count
dictionary
(with a default value of 0) so the program doesn’t throw a KeyError
error when count[character] = count[character] + 1
is executed!
From the output, you can see that the lowercase letter c appears 3 times, the space character appears 13 times, and the uppercase letter A appears 1 time.
display_quiz(path+"setdefault.json", max_width=800)
Dictionary comprehensions provide a convenient notation for quickly generating dictionaries
, often by mapping one dictionary
to another. For example, in a dictionary
with unique values, you can reverse the key–value pairs:
months = {'January': 1, 'February': 2, 'March': 3}
months2 = {number:name for name, number in months.items()}
months2
{1: 'January', 2: 'February', 3: 'March'}
A dictionary comprehension also can map a dictionary
’s values to new values. The following comprehension converts a dictionary
of names and lists
of grades into a dictionary
of names and grade-point averages. The variables k
and v
commonly mean key and value:
grades = {'Sue': [98, 87, 94], 'Bob': [84, 95, 91]}
grades2 = {k:sum(v)/len(v) for k, v in grades.items()}
grades2
{'Sue': 93.0, 'Bob': 90.0}
Consider a game featuring aliens that can have different colors and point values. This simple dictionary
stores information about a particular alien:
alien_0 = {'color': 'green', 'points': 5}
The alien_0
dictionary contains a variety of information about one alien, but it has no room to store information about a second alien, much less a screen full of aliens. How can you manage a fleet of aliens? One way is to make a list of aliens in which each alien is a dictionary
of information about that alien.
aliens = []
# Make 30 green aliens.
for alien_number in range(30):
new_alien = {'color': 'green', 'points': 5, 'speed': 'slow'}
aliens.append(new_alien)
# Show the first 5 aliens.
for alien in aliens[:5]:
print(alien)
print("...")
# Show how many aliens have been created.
print(f"Total number of aliens: {len(aliens)}")
{'color': 'green', 'points': 5, 'speed': 'slow'} {'color': 'green', 'points': 5, 'speed': 'slow'} {'color': 'green', 'points': 5, 'speed': 'slow'} {'color': 'green', 'points': 5, 'speed': 'slow'} {'color': 'green', 'points': 5, 'speed': 'slow'} ... Total number of aliens: 30
These aliens all have the same characteristics, but Python
considers each one a separate object, which allows us to modify each alien individually. How might you work with a group of aliens like this?
Imagine that one aspect of a game has some aliens changing color and moving faster as the game progresses. When it’s time to change colors, we can use a for
loop and an if
statement to change the color of the aliens. For example, to change the first three aliens to yellow, medium-speed aliens worth 10 points each, we could do this:
for alien in aliens[:3]:
if alien['color'] == 'green':
alien['color'] = 'yellow'
alien['speed'] = 'medium'
alien['points'] = 10
aliens[:10]
[{'color': 'yellow', 'points': 10, 'speed': 'medium'}, {'color': 'yellow', 'points': 10, 'speed': 'medium'}, {'color': 'yellow', 'points': 10, 'speed': 'medium'}, {'color': 'green', 'points': 5, 'speed': 'slow'}, {'color': 'green', 'points': 5, 'speed': 'slow'}, {'color': 'green', 'points': 5, 'speed': 'slow'}, {'color': 'green', 'points': 5, 'speed': 'slow'}, {'color': 'green', 'points': 5, 'speed': 'slow'}, {'color': 'green', 'points': 5, 'speed': 'slow'}, {'color': 'green', 'points': 5, 'speed': 'slow'}]
A set
is an unordered collection of unique values. Sets
may contain only immutable objects, like strings
, ints
, floats
and tuples
that contain only immutable elements.
The following code creates a set
of strings named colors
:
colors = {'red', 'orange', 'yellow', 'green', 'red', 'blue'} # Similar to set in math
colors
{'blue', 'green', 'orange', 'red', 'yellow'}
Notice that the duplicate string 'red'
was ignored (without causing an error). An important use of sets
is duplicate elimination, which is automatic when creating a set
. Also, the resulting set’s
values may not be displayed in the same order as they were listed! Though the color names are displayed in sorted order, sets are unordered. You should not write code that depends on the order of their elements!
Though sets
are iterable, they are not sequences and do not support indexing and slicing with square brackets, []
.
colors[0]
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) ~\AppData\Local\Temp\ipykernel_35416\4110793329.py in <module> ----> 1 colors[0] TypeError: 'set' object is not subscriptable
You can determine the number of items in a set with the built-in len()
function:
len(colors)
5
You can check whether a set
contains a particular value using the in
and not in
operators:
'red' in colors, 'purple' not in colors
(True, True)
Sets
are iterable, so you can process each set element with a for
loop:
for color in colors: # {'blue', 'green', 'orange', 'red', 'yellow'}
print(color, end=' ')
red yellow orange green blue
Set
with the Built-In set()
Function¶You can create a set
from another collection of values by using the built-in set()
function — here we create a list
that contains several duplicate integer values and use that list
as set
’s argument:
numbers = list(range(10)) + list(range(5))
numbers
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
set(numbers)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
If you need to create an empty set
, you must use the set()
function with empty parentheses, rather than empty braces, {}
, which represent an empty dictionary
:
set()
set()
Python displays an empty set
as set()
to avoid confusion with Python’s string representation of an empty dictionary
({}
).
Set
Operators and Methods¶Sets
are mutable — you can add and remove elements, but set elements must be immutable. Therefore, a set
cannot have other sets
as elements.
{6, 5, 'a'}
{5, 6, 'a'}
{7, 3, {3,5,7}}
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) ~\AppData\Local\Temp\ipykernel_35416\3783331440.py in <module> ----> 1 {7, 3, {3,5,7}} TypeError: unhashable type: 'set'
Here we first discuss operators and methods that modify an existing set
.
Set method update()
performs a union operation modifying the set in-place — the argument can be any iterable:
numbers = {1, 3, 5}
numbers.update(range(10))
numbers
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Set
method add()
inserts its argument if the argument is not already in the set; otherwise, the
set
remains unchanged:
numbers.add(17)
numbers.add(3)
numbers
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 17}
Set
method remove()
removes its argument from the set
— a KeyError
occurs if the value is not in the set
:
numbers.remove(3)
numbers
{0, 1, 2, 4, 5, 6, 7, 8, 9, 17}
numbers.remove(11)
--------------------------------------------------------------------------- KeyError Traceback (most recent call last) ~\AppData\Local\Temp\ipykernel_35416\550142172.py in <module> ----> 1 numbers.remove(11) KeyError: 11
display_quiz(path+"set.json", max_width=800)
Like dictionary comprehensions, you define set comprehensions in curly braces. Let’s create a new set
containing only the unique even values in the list
numbers:
numbers = [1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 10]
evens = {item for item in numbers if item % 2 == 0}
evens
{2, 4, 6, 8, 10}
set
and dictionary
¶As we mentioned last week, data types like tuples
don’t provide methods like sort()
. However Python
provides the built-in function sorted()
, which takes any sequence as a parameter and returns a new container with the same elements in a different order. You can also apply sorted
to the set, but the returning container will be list
.
help(sorted)
Help on built-in function sorted in module builtins: sorted(iterable, /, *, key=None, reverse=False) Return a new list containing all items from the iterable in ascending order. A custom key function can be supplied to customize the sort order, and the reverse flag can be set to request the result in descending order.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
def rank_key(card):
return RANKS.index(card)
ori_set = {"A", "2", "7", "4", "Q"}
print(sorted(ori_set))
sorted_list = sorted(ori_set, key=rank_key)
# Each element will be replaced by the output of rank_key() and sorts!
print(sorted_list)
['2', '4', '7', 'A', 'Q'] ['A', '2', '4', '7', 'Q']
Note that we have changed the behavior of the sorted()
function by providing the custom key that allows us to sort the data in a specific order using the predefined list
and the index()
function.
If you would like to sort the dictionary
, you need to use the items()
method (Otherwise, it will only return keys). The returning container will again be a list
:
grade = {'calculus':85, 'introduction to mathematics':80, 'introduction to computer science':90, 'linear algebra':95}
sorted_list = sorted(grade.items())
print(sorted_list)
[('calculus', 85), ('introduction to computer science', 90), ('introduction to mathematics', 80), ('linear algebra', 95)]
If you would like to sort by the value, use the following code:
def value_key(x):
return x[1]
grade = {'calculus':85, 'introduction to mathematics':80, 'computer programming':90, 'linear algebra':95}
sorted_dict = sorted(grade.items(), key=value_key)
print(sorted_dict)
[('introduction to mathematics', 80), ('calculus', 85), ('computer programming', 90), ('linear algebra', 95)]
Pokemon.json
. Each item in the list is a dictionary that stores the name of the Pokemon as well as the species' strength. Complete the program by filling the missing part that reads the data of Pokemon from the Pokemon.json
and displays the name with total species strength (summation of HP, Attack, Defense, Sp. Attack, Sp. Defense and Speed) line by line sorted by total species strength.¶import json
# 1. Load the JSON data from the file and reads contents to pokemon_data
with open('Pokemon.json', 'r') as file:
pokemon_data = json.load(file)
# 2. Calculate the total species strength for each Pokemon and add it to pokemon_data
for pokemon in pokemon_data:
# Your code here
_____________________________
pokemon['total_strength'] = total_strength
# 3. Sort the Pokemon data by total species strength
def value_key(x):
return x['total_strength']
# Your code here
sorted_pokemon_data = sorted(pokemon_data, ________, ________)
# 4. Display the sorted Pokemon data
for pokemon in sorted_pokemon_data:
print(________+ "has a total species strength of"+ ________)
In this chapter, we discussed Python’s dictionary
and set
collections. They are both unorder, mutable and do not allow duplicates.
A short comparison of the containers is shown below:
Feature | List | Tuple | Dictionary | Set |
---|---|---|---|---|
Mutable (Can be modified in place) | Yes | No | Yes (keys are immutable) | Yes |
Iterable (Can be use in for loop) | Yes | Yes | Yes | Yes |
Ordered (Can access by index, slicing) | Yes | Yes | No | No |
Duplicate Values | Allowed | Allowed | Not in keys | Not allowed |
from jupytercards import display_flashcards
fpath= "https://raw.githubusercontent.com/phonchi/nsysu-math106A/refs/heads/main/extra/flashcards/"
display_flashcards(fpath + 'ch5.json')